Table of Contents


Journal of Artificial Intelligence and Data Mining
Volume 7, Issue 4, Autumn 2019

  • Publication date: 1398/08/10
  • Number of titles: 12
  • J. Darvish, M. Ezoji * Pages 487-493
    Detection of diabetic retinopathy lesions, such as exudates, in fundus images of the retina can lead to early diagnosis of the disease. A retinal image includes dark areas such as the main blood vessels and retinal tissue, as well as bright areas such as the optic disk, optical fibers and lesions, e.g. exudates. In this paper, a multistage algorithm for the detection of exudates in the foreground is proposed. The algorithm segments the dark background areas in the appropriate channels of the RGB color space using morphological processing such as closing, opening and top-hat operations. Then a suitable edge detector discriminates between exudates and cotton-like spots or other artifacts. To tackle the problem of optical fibers and to discriminate between these bright regions and exudates, in the first stage the main vessels are detected from the green channel of the RGB color space, and then the optical fiber areas around the vessels are marked. An algorithm that uses a PCA-based reconstruction error is proposed to discard another bright fundus structure, the optic disk. Several experiments were performed on the standard HEI-MED database and evaluated against ground-truth images. The results show that the proposed algorithm has a detection accuracy of 96%.
    Keywords: Exudate Detection, Retinal Image Analysis, Diabetic Retinopathy, Biomedical Image Processing
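The morphological operations named in the abstract (closing, opening, top-hat on a selected RGB channel) can be illustrated with a minimal Python/OpenCV sketch. The file name, kernel size and Otsu thresholding are assumptions for illustration, not the authors' exact pipeline.

```python
import cv2

# Hypothetical fundus image file; the paper works on the HEI-MED database.
img = cv2.imread("fundus.png")
green = img[:, :, 1]  # green channel, where vessels and lesions contrast best

kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
# Closing removes thin dark structures; subtracting the original highlights them (vessels).
closed = cv2.morphologyEx(green, cv2.MORPH_CLOSE, kernel)
vessels = cv2.subtract(closed, green)
# Top-hat keeps bright structures smaller than the kernel (exudate candidates).
tophat = cv2.morphologyEx(green, cv2.MORPH_TOPHAT, kernel)
_, candidates = cv2.threshold(tophat, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
```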
  • A. Noruzi, M. Mahlouji *, A. Shahidinejad Pages 495-506
    A biometric system provides automatic identification of an individual based on a unique feature or characteristic that he/she possesses. Iris recognition (IR) is known as one of the most reliable and accurate biometric identification systems. An iris recognition system (IRS) includes an automatic segmentation mechanism based on the Hough transform (HT). This paper presents a robust IRS for unconstrained environments. In this method, an image of the iris is first acquired; then edge detection and contrast adjustment are performed in the pre-processing stage. The circular HT is subsequently utilized to localize the circular inner and outer boundaries of the iris; the purpose of this stage is to find circles in imperfect input images. In addition, the parabolic HT is applied to localize the boundaries of the upper and lower eyelids. The proposed method not only achieves higher accuracy than available IRSs, but also competes with them in terms of processing time. Experimental results on images from the UBIRIS, CASIA and MMUI databases show that the proposed method has accuracy rates of 99.12%, 98.80% and 98.34%, respectively.
    Keywords: Hough transform, Biometric identification, Segmentation, Normalization, Matching
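A minimal sketch of the circular Hough transform step described above, using OpenCV. The input file and all parameter values are assumptions, and the parabolic HT eyelid localization is omitted.

```python
import cv2
import numpy as np

eye = cv2.imread("eye.png", cv2.IMREAD_GRAYSCALE)  # hypothetical iris image
eye = cv2.equalizeHist(eye)                        # contrast adjustment
eye = cv2.medianBlur(eye, 5)                       # denoise before edge-based voting

# Circular HT: localize candidate circles for the pupil/iris boundaries.
circles = cv2.HoughCircles(eye, cv2.HOUGH_GRADIENT, dp=1,
                           minDist=eye.shape[0] // 2,
                           param1=150, param2=40, minRadius=20, maxRadius=150)
if circles is not None:
    for x, y, r in np.round(circles[0]).astype(int):
        cv2.circle(eye, (int(x), int(y)), int(r), 255, 2)  # draw detected boundary
```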
  • V. Naghashi *, Sh. Lotfi Pages 507-519
    Image segmentation is a fundamental step in many image processing applications. In most cases, the image's pixels are clustered based only on their intensity or color, and neither the spatial nor the neighborhood information of the pixels is used in the clustering process. Including the spatial information of pixels, together with the information of neighboring pixels, improves the quality and accuracy of segmentation. In this paper, the idea of combining the K-means algorithm and the Improved Imperialist Competitive Algorithm is proposed. Before applying the hybrid algorithm, a new image is created, and then the hybrid algorithm is employed; finally, a simple post-processing step is applied to the clustered image. Comparing the results of the proposed method with other methods on different images shows that, in most cases, the proposed NLICA algorithm is more accurate than the other methods.
    Keywords: Image segmentation, Clustering, Improved Imperialist Competitive Algorithm, post-processing, Berkeley images dataset
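The role of spatial information in pixel clustering can be sketched with plain K-means over color-plus-coordinate features. This illustrates only the K-means half of the proposed hybrid; the file name, cluster count and coordinate weight are assumptions, and the paper's new-image construction and Improved Imperialist Competitive step are not shown.

```python
import numpy as np
from skimage import io
from sklearn.cluster import KMeans

img = io.imread("image.jpg")          # hypothetical RGB test image
h, w, _ = img.shape
yy, xx = np.mgrid[0:h, 0:w]

# Each pixel is described by its color and (down-weighted) coordinates,
# so neighboring pixels are more likely to end up in the same cluster.
features = np.column_stack([img.reshape(-1, 3).astype(float),
                            0.5 * yy.ravel(), 0.5 * xx.ravel()])
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(features)
segmentation = labels.reshape(h, w)
```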
  • A. R. Yamghani, F. Zargari * Pages 521-535
    Video abstraction allows searching, browsing and evaluating videos by accessing only their useful contents. Most studies work in the pixel domain, which requires decoding and consumes more time and processing than compressed-domain video abstraction. In this paper, we present a new video abstraction method in the H.264/AVC compressed domain, called AVAIF. The method is based on the normalized histogram of the I-frame prediction modes extracted from the H.264 standard. Frame similarity is calculated by intersecting the histograms of the I-frame prediction modes. Moreover, fuzzy c-means clustering is employed to categorize similar frames and extract key frames. The results show that the proposed method achieves, on average, 85% accuracy and a 22% error rate in compressed-domain video abstraction, which is better than the other tested methods in the pixel domain. Moreover, on average, it generates video key frames that are closer to human summaries, and it shows robustness to the coding parameters.
    Keywords: Video Abstraction, Clustering, Prediction modes’ Histogram, Compressed Video, Keyframe Extraction
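The histogram-intersection similarity between I-frame prediction-mode histograms can be written in a few lines. The nine-mode assumption (H.264 4x4 intra prediction) and the toy mode sequences are illustrative; the fuzzy c-means key-frame clustering is omitted.

```python
import numpy as np

def mode_histogram(modes, n_modes=9):
    """Normalized histogram of intra-prediction mode indices for one I-frame."""
    hist = np.bincount(modes, minlength=n_modes).astype(float)
    return hist / hist.sum()

def similarity(h_a, h_b):
    """Histogram intersection: 1.0 means identical mode distributions."""
    return float(np.minimum(h_a, h_b).sum())

# Toy mode sequences for two I-frames.
h1 = mode_histogram(np.array([0, 1, 1, 2, 8, 3, 1]))
h2 = mode_histogram(np.array([0, 1, 2, 2, 8, 3, 3]))
print(similarity(h1, h2))
```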
  • A. Soltani *, M. Soltani Pages 537-550
    High-utility itemset mining (HUIM) is a new, emerging field in data mining that has gained growing interest due to its various applications. The goal of this problem is to discover all itemsets whose utility exceeds a minimum threshold. The basic HUIM problem does not consider the length of itemsets in its utility measurement, and utility values tend to become higher for itemsets containing more items; hence, HUIM algorithms discover an enormous number of long patterns. High average-utility itemset mining (HAUIM) is a variation of HUIM that selects patterns by considering both their utilities and their lengths. In recent decades, several algorithms have been introduced to mine high average-utility itemsets. To speed up the HAUIM process, a new algorithm is proposed here that uses a new list structure and pruning strategy. Several experiments performed on real and synthetic datasets show that the proposed algorithm outperforms state-of-the-art HAUIM algorithms in terms of runtime and memory consumption.
    Keywords: data mining, Frequent Pattern, Utility, High Average-Utility itemset
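The average-utility measure that distinguishes HAUIM from basic HUIM (itemset utility divided by itemset length) can be stated on a toy database. The brute-force enumeration below only illustrates the definition, not the proposed list structure or pruning strategy, and the transactions and threshold are invented.

```python
from itertools import combinations

# Toy transactions: item -> utility of that item in the transaction.
transactions = [
    {"a": 4, "b": 2, "c": 6},
    {"a": 3, "c": 1, "d": 5},
    {"b": 2, "c": 4, "d": 2},
]

def average_utility(itemset):
    """Total utility over supporting transactions, divided by the itemset length."""
    total = sum(sum(t[i] for i in itemset)
                for t in transactions if set(itemset) <= t.keys())
    return total / len(itemset)

items = sorted({i for t in transactions for i in t})
min_au = 5  # assumed minimum average-utility threshold
for k in range(1, len(items) + 1):
    for itemset in combinations(items, k):
        au = average_utility(itemset)
        if au >= min_au:
            print(itemset, au)
```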
  • M. Owhadi Kareshki, M.R. Akbarzadeh T. * Pages 551-561
    The increasingly large scale of available data and the increasingly restrictive concerns about privacy are among the challenging aspects of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with consideration for the confidentiality of data; i.e., only negotiations among local cluster centers are used in the consensus process, hence no private data are transferred. With the proposed use of entropy as an internal measure of consensus clustering validation at each machine, the cluster centers of the local machines with higher expected clustering validity have more influence on the final consensus centers. We also employ the relative cost function of the local Fuzzy C-Means (FCM) and the number of data points in each machine as measures of a machine's relative validity compared with other machines and of its reliability, respectively. The utility of the proposed consensus strategy is examined on 18 datasets from the UCI repository in terms of clustering accuracy and speed-up against the centralized version of FCM. Several experiments confirm that the proposed approach yields higher speed-up and accuracy while maintaining data security due to its protected and distributed processing approach.
    Keywords: Consensus Clustering, Distributed Clustering, Ensemble learning, Entropy
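A minimal sketch of entropy-weighted consensus over local FCM centers, assuming the clusters are already matched across machines. EC3's exact weighting differs; this only illustrates the idea that machines with lower membership entropy (higher expected validity) and more data points get more influence.

```python
import numpy as np

def membership_entropy(U):
    """Mean Shannon entropy of a fuzzy membership matrix U (clusters x points)."""
    U = np.clip(U, 1e-12, 1.0)
    return float(-(U * np.log(U)).sum(axis=0).mean())

def consensus_centers(local_centers, local_memberships, local_sizes):
    """Weighted average of aligned local centers: more points and lower
    membership entropy give a machine more influence (illustrative weighting)."""
    weights = np.array([n / (1.0 + membership_entropy(U))
                        for U, n in zip(local_memberships, local_sizes)])
    weights /= weights.sum()
    stacked = np.stack(local_centers)              # (machines, clusters, features)
    return np.tensordot(weights, stacked, axes=1)  # (clusters, features)
```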
  • H. Hosseinpour *, Seyed A. Moosavie Nia, M. A. Pourmina Pages 563-574
    Virtual view synthesis is an essential part of computer vision and 3D applications. Obtaining a high-quality depth map is the main challenge in virtual view synthesis, because the resolution of the depth image is low compared with that of the corresponding color image. In this paper, an efficient and reliable method based on the gradual omission of outliers is proposed to compute reliable depth values. In the proposed method, depth values that are far from the mean of the depth values are omitted gradually. Compared with other state-of-the-art methods, simulation results show that, on average, PSNR is 2.5 dB (8.1%) higher, SSIM is 0.028 (3%) higher, UNIQUE is 0.021 (2.4%) higher, the running time is 8.6 s (6.1%) shorter, and the number of wrong pixels is 1.97 (24.8%) lower.
    Keywords: Virtual view synthesis, epipolar line, reliable depth, gradual omission of outliers, Hole filling
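The gradual omission of outliers can be sketched as repeatedly discarding candidate depth values that lie far from the current mean. The number of rounds, the k-sigma rule and the toy values are assumptions, not the paper's exact criterion.

```python
import numpy as np

def refine_depth(candidates, n_rounds=3, k=1.5):
    """Iteratively drop values far from the mean, then return the mean of the rest."""
    values = np.asarray(candidates, dtype=float)
    for _ in range(n_rounds):
        mean, std = values.mean(), values.std()
        keep = np.abs(values - mean) <= k * std
        if keep.all() or keep.sum() < 2:
            break
        values = values[keep]
    return values.mean()

# Toy candidate depths for one pixel; the outlier 7.5 is discarded.
print(refine_depth([2.0, 2.1, 1.9, 2.0, 7.5]))
```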
  • N. Khozouie, F. Fotouhi Ghazvini *, B. Minaei Pages 575-588
    Context-aware systems must be interoperable and work across different platforms at any time and in any place. Context data collected from wireless body area networks (WBANs) may be heterogeneous and imperfect, which makes the design and implementation of such systems difficult. In this research, we introduce a model that takes the dynamic nature of a context-aware system into consideration. This model is constructed according to the four-dimensional objects approach and three-dimensional events for the data collected from a WBAN. In order to support mobility and reasoning on temporal data transmitted from a WBAN, a hierarchical ontology-based model is presented. It supports the relationship between heterogeneous environments and reasoning on the context data for extracting higher-level knowledge. Location is considered a temporal attribute. To support temporal entities, the reification method and Allen's algebra relations are used. Using reification, the new classes Time_slice and Time_Interval and the new attributes ts_time_slice and ts_time_Interval are defined in the context-aware ontology. Then Allen's thirteen logical relations, such as Equal, After and Before, are added to the properties through the OWL-Time ontology. The integration and consistency of the context-aware ontology are checked by the Pellet reasoner. This hybrid context-aware ontology is evaluated by three experts using the FOCA method, which is based on the Goal-Question-Metric (GQM) approach. This evaluation methodology assesses the ontology numerically and decreases the subjectivity and the dependency on the evaluator's experience. The overall performance quality according to the completeness, adaptability, conciseness, consistency, computational efficiency and clarity metrics is 0.9137.
    Keywords: Hybrid context-aware modeling, Ontology model, Spatio-temporal data
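A small rdflib sketch of the reification classes and an Allen interval relation expressed via OWL-Time, in the spirit of the abstract. The namespace, the individuals and the specific relation chosen (intervalBefore) are illustrative assumptions, not the authors' ontology.

```python
from rdflib import Graph, Namespace, RDF, OWL

EX = Namespace("http://example.org/wban#")        # hypothetical ontology namespace
TIME = Namespace("http://www.w3.org/2006/time#")  # OWL-Time vocabulary

g = Graph()
g.bind("ex", EX)
g.bind("time", TIME)

# Reification classes and properties named in the abstract.
g.add((EX.Time_slice, RDF.type, OWL.Class))
g.add((EX.Time_Interval, RDF.type, OWL.Class))
g.add((EX.ts_time_slice, RDF.type, OWL.ObjectProperty))
g.add((EX.ts_time_Interval, RDF.type, OWL.ObjectProperty))

# A location time-slice whose interval precedes another (Allen's "Before",
# expressed here with the OWL-Time property time:intervalBefore).
g.add((EX.location1_slice, RDF.type, EX.Time_slice))
g.add((EX.location1_slice, EX.ts_time_Interval, EX.interval1))
g.add((EX.interval1, TIME.intervalBefore, EX.interval2))

print(g.serialize(format="turtle"))
```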
  • M. Tajamolian, M. Ghasemzadeh * Pages 589-596
    To achieve live migration of virtual machines, two strategies, "pre-copy" and "post-copy", have been presented. Depending on the operating conditions of the machine, each of these strategies may perform better than the other. In this article, a new algorithm is presented that automatically decides how the live migration of a virtual machine takes place. In this approach, the virtual machine memory is considered an informational object that has a revision number and is constantly changing. We have determined precise criteria for evaluating the behavior of a virtual machine and automatically selecting the appropriate live migration strategy. Different aspects of the required simulations and implementations are also considered in this article. Analytical evaluation shows that using the proposed scheme and the presented algorithm can significantly improve the live migration process of virtual machines.
    Keywords: Virtual Machine Live Migration, Pre-Copy & Post-Copy, Quadruple Adaptive Version Numbering Scheme, Decision Making Algorithm, Cloud Computing
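A toy decision function for choosing between pre-copy and post-copy. The inputs (page dirty rate, link bandwidth, working-set size) and the threshold test are assumptions for illustration, not the paper's quadruple version-numbering criteria.

```python
def choose_strategy(dirty_rate_mb_s, link_bw_mb_s, working_set_mb,
                    ratio_threshold=0.5, max_precopy_set_mb=4096):
    """If memory is dirtied much more slowly than it can be copied, pre-copy
    iterations converge quickly; otherwise post-copy avoids endless re-sends."""
    ratio = dirty_rate_mb_s / link_bw_mb_s
    if ratio < ratio_threshold and working_set_mb <= max_precopy_set_mb:
        return "pre-copy"
    return "post-copy"

# Example: a lightly dirtied VM on a 10 Gb/s (~1250 MB/s) link.
print(choose_strategy(dirty_rate_mb_s=80, link_bw_mb_s=1250, working_set_mb=2048))
```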
  • F. Nosratian, H. Nematzadeh *, H. Motameni Pages 597-606
    The World Wide Web is growing at a very fast pace and makes a large amount of information available to the public. Search engines use conventional methods to retrieve information on the Web; however, their search results can still be refined, and their accuracy is not high enough. One approach to web mining is evolutionary algorithms, which search according to the user's interests. The proposed method, based on a genetic algorithm, optimizes important relationships among links on web pages and also presents a way of classifying web documents. Likewise, the proposed method finds the best pages among those retrieved by the engines. It also calculates the quality of pages from web page features, either independently or dependently. The proposed algorithm is complementary to search engines. In the proposed method, after implementing the genetic algorithm in MATLAB 2013 with a crossover rate of 0.7 and a mutation rate of 0.05, the best and most similar pages are presented to the user. The optimal solutions remained fixed over several runs of the proposed algorithm.
    Keywords: Genetic Algorithm, web mining, evolutionary computation
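A compact genetic-algorithm skeleton using the crossover rate (0.7) and mutation rate (0.05) quoted in the abstract. The chromosome encoding (page-feature weights), the fitness function and the population settings are assumptions, since the abstract does not specify them.

```python
import random

random.seed(0)
N_FEATURES, POP, GENS = 8, 30, 50   # assumed sizes

def fitness(weights, pages):
    """Toy objective: the weighted feature score should separate relevant
    pages from irrelevant ones (not the paper's actual fitness)."""
    score = lambda p: sum(w * f for w, f in zip(weights, p["features"]))
    rel = [score(p) for p in pages if p["relevant"]]
    irr = [score(p) for p in pages if not p["relevant"]]
    return sum(rel) / len(rel) - sum(irr) / len(irr)

def evolve(pages, cx_rate=0.7, mut_rate=0.05):
    pop = [[random.random() for _ in range(N_FEATURES)] for _ in range(POP)]
    for _ in range(GENS):
        pop.sort(key=lambda ind: fitness(ind, pages), reverse=True)
        nxt = [ind[:] for ind in pop[:2]]              # elitism
        while len(nxt) < POP:
            a, b = random.sample(pop[:10], 2)          # select among the best
            cut = random.randrange(1, N_FEATURES)
            child = a[:cut] + b[cut:] if random.random() < cx_rate else a[:]
            child = [random.random() if random.random() < mut_rate else g
                     for g in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=lambda ind: fitness(ind, pages))
```

Calling evolve() requires a list of page dictionaries, each with a "features" vector and a boolean "relevant" label; both are hypothetical stand-ins for whatever page representation the paper actually uses.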
  • A. Zangooei, V. Derhami, F. Jamshidi * Pages 607-616
    Phishing is one of the luring techniques used to exploit personal information. A phishing webpage detection system (PWDS) extracts features to determine whether a webpage is a phishing page or not. Selecting appropriate features improves the performance of the PWDS; the performance criteria are detection accuracy and system response time. Most of the time consumed by a PWDS arises from feature extraction, which is treated as the feature cost in this paper. Here, two novel features are proposed. They use a semantic similarity measure to determine the relationship between the content and the URL of a page. Since the suggested features do not rely on third-party services such as search engine results, the feature extraction time decreases dramatically. A login-form pre-filter is utilized to reduce unnecessary calculations and the false positive rate. In this paper, cost-based feature selection is presented to select the most effective features. The selected features are employed in the suggested PWDS, and the extreme learning machine algorithm is used to classify webpages. The experimental results demonstrate that the suggested PWDS achieves a high accuracy of 97.6% and a short average detection time of 120.07 milliseconds.
    Keywords: Cost-based feature selection, Extreme learning machine, Phishing, Semantic similarity, Term Frequency, Inverse Document Frequency (TF-IDF)
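The content-versus-URL relationship can be approximated with a TF-IDF cosine similarity, in the spirit of the TF-IDF keyword above. This is a stand-in illustration, not the authors' semantic similarity measure; the toy page text and URL are invented.

```python
import re
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def url_tokens(url):
    """Extract word-like tokens from the hostname and path of a URL."""
    return " ".join(re.findall(r"[a-zA-Z]{3,}", url))

page_text = "Sign in to your PayPal account to review recent activity"   # toy content
url = "http://paypa1-security-check.example.com/login"                   # toy URL

docs = [page_text.lower(), url_tokens(url).lower()]
tfidf = TfidfVectorizer().fit_transform(docs)
sim = cosine_similarity(tfidf[0], tfidf[1])[0, 0]
print(f"content/URL similarity: {sim:.2f}")   # a low value hints at phishing
```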
  • S. Taherian Dehkordi, A. Khatibi Bardsiri *, M. H. Zahedi Pages 617-630
    Data mining is an appropriate way to discover information and hidden patterns in large amounts of data, patterns that cannot be easily discovered in ordinary ways. One of the most interesting applications of data mining is the discovery of diseases and disease patterns by investigating patients' records. Early diagnosis of diabetes can reduce the effects of this devastating disease. A common way to diagnose this disease is a blood test, which, despite its high precision, has some disadvantages such as pain, cost, patient stress, and lack of access to a laboratory. Diabetic patients' information contains hidden patterns that can help assess the risk of diabetes in individuals without performing any blood tests. The use of neural networks, as powerful data mining tools, is an appropriate method for discovering hidden patterns in diabetic patients' information. In this paper, in order to discover the hidden patterns and diagnose diabetes, a water wave optimization (WWO) algorithm, a precise metaheuristic, was used along with a neural network to increase the precision of diabetes prediction. The results of our implementation in the MATLAB programming environment, using a diabetes dataset, indicate that the proposed method diagnosed diabetes with a precision of 94.73%, sensitivity of 94.20%, specificity of 93.34%, and accuracy of 95.46%, and was more sensitive than methods such as support vector machines, artificial neural networks, and decision trees.
    Keywords: Diabetes Mellitus, data mining, Artificial Neural Networks, Water Wave Optimization (WWO) Algorithm
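A minimal sketch of the neural-network classification step on synthetic data. A standard gradient-trained MLP stands in for the WWO-tuned network of the paper, and the data, architecture and train/test split are assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic stand-in for a diabetes dataset (8 clinical attributes, binary label).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))
y = (X[:, 1] + 0.5 * X[:, 5] + rng.normal(scale=0.5, size=500) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(10,), max_iter=1000, random_state=0)
clf.fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```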